Abstract: Given two dependent stochastic processes $X$ and $Y$, and a stopping time $\tau$ on $X$, the tracking stopping time problem consists in finding a stopping time $\eta$ on $Y$ that best tracks $\tau$, e.g., so as to minimize the mean absolute deviation $\mathbb{E}|\eta - \tau|$.

This problem formulation applies in several areas including control, communication, and finance. However, the problem is in general hard to solve analytically as it generalizes the well-known (Bayesian) change-point detection problem for which solutions have been reported only for specific settings.

In this paper we provide an analytical solution to a tracking stopping time problem that cannot be formulated as a change-point problem. For the setting where $X$ and $Y$ are correlated Gaussian random walks, and where $\tau$ is the crossing time of some given threshold, we provide upper and lower bounds on $\inf_\eta \mathbb{E}|\eta - \tau|$ whose main asymptotic terms coincide as the threshold tends to infinity. The results immediately extend to the continuous time setting where $X$ and $Y$ are correlated standard Brownian motions with drift.
1. Background



The tracking stopping time (TST) problem is defined as follows. Let $X = \{X_t\}_{t \ge 0}$ be a discrete-time stochastic process and let $\tau$ be a stopping time defined over $X$. Statistician has access to $X$ only through correlated observations $Y = \{Y_t\}_{t \ge 0}$. Knowing the probability distribution of $(X, Y)$ and the stopping rule $\tau$, Statistician wishes to find a stopping time $\eta$ so as to minimize the mean absolute deviation $\mathbb{E}|\eta - \tau|$. (Recall that a stopping time with respect to a stochastic process $\{X_t\}_{t \ge 0}$ is a random variable $\tau$ taking values in the positive integers such that $\{\tau = t\} \in \mathcal{F}_t$ for all $t > 0$, where $\mathcal{F}_t$ denotes the $\sigma$-algebra generated by $X_0, X_1, \ldots, X_t$.)

The TST problem formulation, introduced in [8], naturally generalizes to continuous time and to other delay penalty functions such as $\mathbb{E}(\eta - \tau)^+$ for a fixed 'false-alarm' probability level $\mathbb{P}(\eta < \tau)$. Important situations are when the observation process is a noisy version of $X$, a delayed version of $X$, or represents partial information with respect to $X$: at time $t$, $X_t = (\tilde X_t, \tilde Y_t)$ and Statistician observes only $Y_t = \tilde Y_t$. For specific examples of applications of the TST problem related to monitoring, forecasting, and communication we refer to [8].

In [8], an algorithmic approach is proposed for discrete-time settings where all the $X_i$'s and $Y_i$'s take values in a common finite alphabet (otherwise the $X$ and $Y$ processes are arbitrary), and where $\tau$ is bounded by some constant $c \ge 1$. Given the probability distribution of $(X, Y)$ and the stopping rule of $\tau$, the algorithm outputs the minimum reaction delay $\mathbb{E}(\eta - \tau)^+$ together with an optimal stopping rule, for all false-alarm probability levels $\mathbb{P}(\eta < \tau) \le \alpha$, $\alpha \in [0, 1]$. Under certain conditions on $(X, Y)$ and $\tau$, the computational complexity of this algorithm is polynomial in $c$.

What motivated an algorithmic approach to the TST problem is that it generalizes the Bayesian change-point detection problem, a long-studied problem with applications to industrial quality control that dates back to the 1940s [1], and for which analytical solutions have been reported only for specific, mostly asymptotic, settings.

In the Bayesian change-point problem, there is a random variable $\theta$, taking values in the positive integers, and two probability distributions $P_0$ and $P_1$. Under $P_0$, the conditional density function of $Y_t$ given $Y_1, Y_2, \ldots, Y_{t-1}$ is $f_0(Y_t \mid Y_1, Y_2, \ldots, Y_{t-1})$, for every $t > 0$. Under $P_1$, the conditional density function of $Y_t$ given $Y_1, Y_2, \ldots, Y_{t-1}$ is $f_1(Y_t \mid Y_1, Y_2, \ldots, Y_{t-1})$, for every $t > 0$. The observed process is distributed according to $P^\theta$, which assigns the same conditional density functions as $P_0$ for all $t < \theta$, and the same conditional density functions as $P_1$ for all $t \ge \theta$.

The Bayesian change-point problem typically consists in finding a stopping time $\eta$, with respect to $\{Y_t\}$, that minimizes some function of the delay $\eta - \theta$. Shiryaev [9, 10], for instance, considered minimizing

$$\mathbb{E}(\eta - \theta)^+ + \lambda\, \mathbb{P}(\eta < \theta)$$

for some given constant $\lambda > 0$. Assuming a geometric prior on the change-point $\theta$, and that before and after $\theta$ the observations are independent with common density function $f_0$ for $t < \theta$, and $f_1$ for $t \ge \theta$, Shiryaev showed that an optimal $\eta$ stops as soon as the posterior probability that a change occurred exceeds a certain fixed threshold. Later, Yakir [12] generalized Shiryaev's result by considering finite-state Markov chains. For more general prior distributions on $\theta$, the problem is known to become difficult to handle. However, in the limit of small false-alarm probabilities $\mathbb{P}(\eta < \theta) \to 0$, Lai [3] and, later, Tartakovsky and Veeravalli [11], derived asymptotically optimal detection policies for the Bayesian change-point problem under general assumptions on the distributions of the change-point and observed process. (For the non-Bayesian version of the change-point problem we refer the reader to [5, 7].)
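The threshold rule on the posterior probability described above can be sketched numerically. The snippet below is only an illustration and is not taken from [9, 10]: it assumes a geometric prior with parameter `rho`, unit-variance Gaussian densities with hypothetical means `mu0` (before the change) and `mu1` (from the change on), and a user-chosen threshold. All function and parameter names are assumptions made for this example.

```python
import math


def gaussian_pdf(y, mean):
    """Density of a unit-variance Gaussian with the given mean (illustrative choice)."""
    return math.exp(-0.5 * (y - mean) ** 2) / math.sqrt(2.0 * math.pi)


def shiryaev_stop(observations, rho, mu0, mu1, threshold):
    """Stop the first time the posterior probability of a change reaches `threshold`.

    Returns (stopping index, posterior path); the index is None if the rule never stops.
    path[t-1] approximates P(change-point <= t | y_1, ..., y_t) under a geometric(rho) prior.
    """
    posterior = 0.0
    path = []
    for t, y in enumerate(observations, start=1):
        # Prior probability that the change has occurred by time t.
        prior = posterior + (1.0 - posterior) * rho
        # Bayes update: observations at or after the change follow the mu1 density.
        num = prior * gaussian_pdf(y, mu1)
        den = num + (1.0 - prior) * gaussian_pdf(y, mu0)
        posterior = num / den
        path.append(posterior)
        if posterior >= threshold:
            return t, path
    return None, path
```

With a clear level shift in the data, for instance `shiryaev_stop([0, 0, 0, 3, 3, 3, 3], 0.1, 0.0, 3.0, 0.95)`, the posterior stays small before the shift and rises quickly after it.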

It can be shown that any Bayesian change-point problem can be formulated 
as a TST problem, and that a TST problem cannot, in general, be formulated as 
a Bayesian change-point problem [8]. The TST problem therefore generalizes the 
Bayesian change-point problem, which is analytically tractable only in special 
cases. 

Our main contribution relates to the situation where $X$ and $Y$ are correlated Gaussian random walks given by $X_0 = Y_0 = 0$, $X_t = s \cdot t + \sum_{i=1}^{t} V_i$ and $Y_t = X_t + \varepsilon \sum_{i=1}^{t} W_i$, for $t \ge 1$ and some arbitrary constants $s \ge 0$ and $\varepsilon > 0$. The $V_i$'s and $W_i$'s are assumed to be independent standard Gaussian (i.e., zero mean unit variance) random variables. The stopping time to be tracked is the threshold crossing moment $\tau_l = \inf\{t \ge 0 : X_t \ge l\}$ for some arbitrary threshold level $l > 0$. For this setting, we provide upper and lower bounds on $\inf_\eta \mathbb{E}|\eta - \tau_l|$ that imply

$$\inf_\eta \mathbb{E}|\eta - \tau_l| = \sqrt{\frac{2 l \varepsilon^2}{\pi s^3 (1 + \varepsilon^2)}}\,\bigl(1 + o(1)\bigr) \qquad (l \to \infty) \tag{1.1}$$

for fixed $s > 0$ and $\varepsilon > 0$. Interestingly, (1.1) is still valid if we let $\eta$ be an estimator of $\tau_l$ that depends on the entire sequence $Y_0^\infty$; causality doesn't come at the expense of increased delay in the above asymptotic regime.

For the particular case where the random walks have no drift, i.e., $s = 0$, we show that $\mathbb{E}|\eta - \tau_l|^r = \infty$ whenever $r \ge 1/2$, $\varepsilon > 0$, and $l > 0$, for any estimate $\eta$ of $\tau_l$ that potentially may also depend on the entire observation process $Y_0^\infty$.

The above results naturally extend to the continuous time setting where $\sum_{i=1}^{t} V_i$ and $\sum_{i=1}^{t} W_i$ are replaced by two independent standard Brownian motions. In particular, (1.1) remains valid for fixed $s > 0$ and $\varepsilon > 0$.

Section 2 contains the main results and Section 3 is devoted to the proofs. 

2. Problem Formulation and Main Results 

We consider the discrete-time processes

$$X:\quad X_0 = 0, \qquad X_t = \sum_{i=1}^{t} V_i + st, \quad t \ge 1,$$

$$Y:\quad Y_0 = 0, \qquad Y_t = X_t + \varepsilon \sum_{i=1}^{t} W_i, \quad t \ge 1,$$

where $V_1, V_2, \ldots$ and $W_1, W_2, \ldots$ are two independent sequences of independent standard (i.e., zero mean unit variance) Gaussian random variables, and where $s \ge 0$ and $\varepsilon \ge 0$ are arbitrary constants.

Given the threshold crossing time

$$\tau_l = \inf\{t \ge 0 : X_t \ge l\}$$

for some arbitrary level $l \ge 0$, we aim at finding a stopping time with respect to the observation process $Y$ that best tracks $\tau_l$. Specifically, we consider the optimization problem

$$\inf_\eta \mathbb{E}|\eta - \tau_l|, \tag{2.1}$$

where the minimization is over all stopping times $\eta$ defined with respect to the natural filtration induced by the $Y$ process.
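To make the setting concrete, the following sketch simulates one realization of the two random walks over a finite horizon and reads off the first time a path reaches a level. It is an illustration only: the function names, the finite `horizon`, and the use of Python's `random` module are assumptions of this example and not part of the original formulation.

```python
import random


def simulate_paths(s, eps, horizon, rng):
    """Return lists X, Y of length horizon + 1 with X_0 = Y_0 = 0.

    X_t = s*t + sum of standard Gaussians V_i,
    Y_t = X_t + eps * (sum of independent standard Gaussians W_i).
    """
    x, noise = 0.0, 0.0
    X, Y = [0.0], [0.0]
    for _ in range(horizon):
        x += s + rng.gauss(0.0, 1.0)      # drift s plus increment V_i
        noise += rng.gauss(0.0, 1.0)      # increment W_i of the observation noise
        X.append(x)
        Y.append(x + eps * noise)
    return X, Y


def crossing_time(path, level):
    """First index t >= 0 with path[t] >= level, or None if the level is not
    reached within the simulated horizon."""
    for t, value in enumerate(path):
        if value >= level:
            return t
    return None
```

For example, `X, Y = simulate_paths(1.0, 0.5, 200, random.Random(0))` followed by `crossing_time(X, 50.0)` gives one sample of the crossing time of level 50 within the horizon.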

To avoid trivial situations, we restrict $l$ and $\varepsilon$ to be strictly positive. When $l = 0$ or $\varepsilon = 0$, (2.1) is equal to zero: for $l = 0$, $\eta = 0$ is optimal, and for $\varepsilon = 0$, $\eta = \tau_l$ is optimal.

The reason for restricting our attention to the case where $s$ is also strictly positive is that, when $s = 0$, (2.1) is infinite for all $l > 0$ and $\varepsilon > 0$. In fact, Proposition 2.1, given at the end of this section, provides a stronger statement: for $s = 0$, $\varepsilon > 0$, and $l > 0$, we have $\mathbb{E}|\eta - \tau_l|^r = \infty$ for any $r \ge 1/2$ and any estimator $\eta = \eta(Y_0^\infty)$ of $\tau_l$ that may depend on the entire observation process $Y_0^\infty$ (i.e., $\eta$ need not be a stopping time).

The following theorem provides a non-asymptotic upper bound on (2.1) which 
is achieved by a threshold crossing stopping time applied to a certain estimate 
of the X process: 

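Theorem 2.1 (Upper bound). Fix $\varepsilon > 0$, $s > 0$, $l > 0$, and define $\hat X_t$ as

$$\hat X_0 = 0, \qquad \hat X_t = st + \frac{1}{1 + \varepsilon^2}\,(Y_t - st) \quad \text{for } t \ge 1.$$

Then, the stopping time $\eta = \inf\{t \ge 0 : \hat X_t \ge l\}$ satisfies

$$\mathbb{E}|\eta - \tau_l| \le \sqrt{\frac{2 l \varepsilon^2}{\pi (1 + \varepsilon^2) s^3}} + \frac{6}{s}\left(\frac{l}{(2\pi s)^3}\right)^{1/4} + \sqrt{\frac{8(s + 2)}{\pi s^3}} + 10 + \frac{20}{s}. \tag{2.2}$$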
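As a sanity check of Theorem 2.1, the sketch below evaluates the right-hand side of the upper bound and compares it with a Monte Carlo estimate of the tracking error of the threshold rule applied to $\hat X_t = st + (Y_t - st)/(1+\varepsilon^2)$. It assumes the bound reads $\sqrt{2l\varepsilon^2/(\pi(1+\varepsilon^2)s^3)} + (6/s)(l/(2\pi s)^3)^{1/4} + \sqrt{8(s+2)/(\pi s^3)} + 10 + 20/s$, as obtained at the end of the proof in Section 3; the function names and the simulation design are hypothetical and only meant as an illustration for $s > 0$ and $l > 0$.

```python
import math
import random


def upper_bound(l, s, eps):
    """Right-hand side of the upper bound, under the reading stated in the lead-in."""
    lead = math.sqrt(2.0 * l * eps ** 2 / (math.pi * (1.0 + eps ** 2) * s ** 3))
    second = (6.0 / s) * (l / (2.0 * math.pi * s) ** 3) ** 0.25
    third = math.sqrt(8.0 * (s + 2.0) / (math.pi * s ** 3))
    return lead + second + third + 10.0 + 20.0 / s


def track_once(l, s, eps, rng):
    """Simulate one pair (tau, eta) and return |eta - tau|. Requires s > 0 and l > 0."""
    c = 1.0 / (1.0 + eps ** 2)
    x = noise = 0.0
    tau = eta = None
    t = 0
    while tau is None or eta is None:
        t += 1
        x += s + rng.gauss(0.0, 1.0)
        noise += rng.gauss(0.0, 1.0)
        y = x + eps * noise
        x_hat = s * t + c * (y - s * t)    # estimate of X_t based on Y_t only
        if tau is None and x >= l:
            tau = t                        # X crosses the level
        if eta is None and x_hat >= l:
            eta = t                        # the estimate crosses the level
    return abs(eta - tau)
```

Averaging `track_once(50.0, 1.0, 1.0, rng)` over a few hundred runs gives an empirical tracking error that can be compared with `upper_bound(50.0, 1.0, 1.0)`. For moderate `l` the additive constants dominate the bound, which is consistent with the theorem being sharp only asymptotically.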

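The next theorem provides a non-asymptotic lower bound on $\mathbb{E}|\eta - \tau_l|$ for any estimate $\eta = \eta(Y_0^\infty)$ of $\tau_l$ that has access to the entire sequence $Y_0^\infty$. The function $Q(x)$ is defined as $Q(x) = (2\pi)^{-1/2} \int_x^\infty \exp(-u^2/2)\,du$.

Theorem 2.2 (Lower bound). Let $\varepsilon > 0$ and $l/s > 2$ with $s > 0$. Then, for any integer $n$ such that $1 < n < l/s$, the following lower bound holds:

$$\inf_{\eta(Y_0^\infty)} \mathbb{E}|\eta - \tau_l| \ge \frac{1}{s}\sqrt{\frac{2 n \varepsilon^2}{\pi(1 + \varepsilon^2)}}\left[1 - Q\!\left(\frac{l - sn}{\sqrt{n(1 + \varepsilon^2)}}\right)\right] - \sqrt{\frac{2}{\pi s^3}}\left(l - sn + \sqrt{\frac{n}{2\pi}}\right)^{1/2} - 2 - \frac{4}{s}. \tag{2.3}$$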
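The lower bound depends on the free integer $n$, so in practice one would evaluate it for every admissible $n$ and keep the largest value. The sketch below does this under the assumption that the bound is the combination of (3.28), (3.30), and (3.31) from Section 3, that is, $\frac{1}{s}\sqrt{2n\varepsilon^2/(\pi(1+\varepsilon^2))}\,[1 - Q((l - sn)/\sqrt{n(1+\varepsilon^2)})] - \sqrt{2/(\pi s^3)}\,(l - sn + \sqrt{n/(2\pi)})^{1/2} - 2 - 4/s$. The function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail Q(x) = (2*pi)^(-1/2) * integral_x^inf exp(-u^2/2) du."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def lower_bound(l, s, eps, n):
    """Lower bound for a given n with 1 < n < l/s, under the reading in the lead-in."""
    gap = l - s * n
    lead = math.sqrt(2.0 * n * eps ** 2 / (math.pi * (1.0 + eps ** 2))) / s
    lead *= 1.0 - q_function(gap / math.sqrt(n * (1.0 + eps ** 2)))
    penalty = math.sqrt(2.0 / (math.pi * s ** 3)) * math.sqrt(gap + math.sqrt(n / (2.0 * math.pi)))
    return lead - penalty - 2.0 - 4.0 / s


def best_lower_bound(l, s, eps):
    """Maximize the bound over all integers n with 1 < n < l/s (needs l/s > 2)."""
    candidates = range(2, math.ceil(l / s))
    return max(lower_bound(l, s, eps, n) for n in candidates)
```

The maximizing $n$ sits slightly below $l/s$: taking $n$ too close to $l/s$ makes the $Q$ term large, while taking it too small inflates the subtracted square-root term.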



When n approaches l/s and l/s tends to infinity in a suitable way, the up- 
per and lower bounds (2.2) and (2.3) become tight. The following result is an 
immediate consequence of these bounds: 
Theorem 2.3 (Asymptotics). Let $q$ be a constant such that $1/2 < q < 1$. In the asymptotic regime where $l/s > 2$,

$$s\left(\frac{l}{s}\right)^{q - 1/2} \to \infty, \qquad \text{and} \qquad \left(\frac{l}{s}\right)^{1 - q} \varepsilon^2 \to \infty,$$

we have

$$\inf_{\eta(Y_0^\infty)} \mathbb{E}|\eta - \tau_l| = \inf_{\eta} \mathbb{E}|\eta - \tau_l| = \sqrt{\frac{2 l \varepsilon^2}{\pi s^3 (1 + \varepsilon^2)}}\,[1 + o(1)]. \tag{2.4}$$

In particular, (2.4) holds in the limit $l \to \infty$ for fixed $s > 0$ and $\varepsilon > 0$.

To prove Theorem 2.1, we consider $\eta = \inf\{t \ge 0 : \hat X_t^{(c)} \ge l\}$, where $\hat X_t^{(c)}$ is the estimate of $X_t$ defined as $\hat X_t^{(c)} = st + c(Y_t - st)$, then optimize over $c \ge 0$. It should be noted that, in the asymptotic regime (given by Theorem 2.3) where the upper and lower bounds on $\inf_\eta \mathbb{E}|\eta - \tau_l|$ coincide, the optimal $c$ (equal to $1/(1 + \varepsilon^2)$) is the value for which the variance of $X_t - \hat X_t^{(c)}$ is minimized.
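The variance claim can be checked directly. Since $X_t - \hat X_t^{(c)} = (1-c)\sum_i V_i - c\varepsilon\sum_i W_i$, each step contributes variance $(1-c)^2 + c^2\varepsilon^2$, which is minimized at $c = 1/(1+\varepsilon^2)$ with minimum value $\varepsilon^2/(1+\varepsilon^2)$. The short sketch below verifies this on a grid; the function names are hypothetical.

```python
def error_variance_per_step(c, eps):
    """Variance of one increment of X_t - X_hat_t^(c), i.e. (1 - c)^2 + c^2 * eps^2."""
    return (1.0 - c) ** 2 + (c * eps) ** 2


def optimal_c(eps):
    """Minimizer of the per-step error variance over c."""
    return 1.0 / (1.0 + eps ** 2)


def grid_minimum(eps, step=0.01, c_max=2.0):
    """Smallest per-step variance found on the grid {0, step, 2*step, ..., c_max}."""
    count = int(round(c_max / step))
    return min(error_variance_per_step(k * step, eps) for k in range(count + 1))
```

For small $\varepsilon$ the optimal $c$ is close to one (trust the observations), while for large $\varepsilon$ it is close to zero (rely on the drift alone).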

Let us now consider the setting where $\sum_{i=1}^{t} V_i$ and $\sum_{i=1}^{t} W_i$ are replaced by standard Brownian motions, i.e., with the $X$ and the $Y$ processes being defined as

$$X:\quad X_0 = 0, \qquad X_t = B_t + st \quad \text{for } t > 0,$$

$$Y:\quad Y_0 = 0, \qquad Y_t = X_t + \varepsilon N_t \quad \text{for } t > 0,$$

where $\{B_t\}_{t \ge 0}$ and $\{N_t\}_{t \ge 0}$ are two independent standard Brownian motions. The previous results easily extend to the Brownian motion setting. Indeed, the analysis is simpler than for the Gaussian random walk setting as there is no 'excess over threshold' for a Brownian motion: the value of a Brownian motion the first time it crosses a certain level equals this level.

Theorems 2.4, 2.5, and 2.6 are analogous to Theorems 2.1, 2.2, and 2.3, 
respectively. 

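Theorem 2.4 (Upper bound: Brownian motion with drift). Fix $\varepsilon > 0$, $s > 0$, $l > 0$, and define $\hat X_t$ as

$$\hat X_0 = 0, \qquad \hat X_t = st + \frac{1}{1 + \varepsilon^2}\,(Y_t - st) \quad \text{for } t > 0.$$

Then, the stopping time $\eta = \inf\{t \ge 0 : \hat X_t = l\}$ satisfies

$$\mathbb{E}|\eta - \tau_l| \le \sqrt{\frac{2 l \varepsilon^2}{\pi (1 + \varepsilon^2) s^3}} + \frac{6}{s}\left(\frac{l}{(2\pi s)^3}\right)^{1/4}.$$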
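Theorem 2.5 (Lower bound: Brownian motion with drift). Let $\varepsilon > 0$, $s > 0$, and $l > 0$, and let $n$ be such that $1 < n < l/s$. Then,

$$\inf_{\eta(Y_0^\infty)} \mathbb{E}|\eta - \tau_l| \ge \frac{1}{s}\sqrt{\frac{2 n \varepsilon^2}{\pi(1 + \varepsilon^2)}}\left[1 - Q\!\left(\frac{l - sn}{\sqrt{n(1 + \varepsilon^2)}}\right)\right] - \sqrt{\frac{2}{\pi s^3}}\left(l - sn + \sqrt{\frac{n}{2\pi}}\right)^{1/2}.$$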

The following theorem is an immediate consequence of Theorems 2.4 and 2.5.

Theorem 2.6 (Asymptotics: Brownian motion with drift). Theorem 2.3 is also valid in the Brownian motion setting.

We end this section with a proposition related to the particular case where $s = 0$, which we referred to earlier. When $s = 0$, $\varepsilon > 0$, and $l > 0$, it is impossible to finitely track $\tau_l$, even having access to the entire observation process $Y_0^\infty$: for any estimate $\eta = \eta(Y_0^\infty)$, $\mathbb{E}(|\eta - \tau_l|^r) = \infty$ for all $r \ge 1/2$. The proposition is valid in both the Gaussian random walk and the Brownian motion settings.

Proposition 2.1. Let $s = 0$ and let $f(x)$, $x \ge 0$, be a non-negative and non-decreasing function such that

$$\mathbb{E} f(\tau_h/2) = \infty \tag{2.5}$$

for some constant $h > 0$. Then,

i. $\mathbb{E} f(|\eta - \tau_l|) = \infty$ for any estimate $\eta = \eta(Y_0^\infty)$, whenever $\varepsilon > 0$ and $l > 0$.
ii. If $f(x) = x^r$, $r \ge 1/2$, then (2.5) holds for all $h > 0$, whenever $\varepsilon > 0$ and $l > 0$. (Hence, $\mathbb{E}|\eta - \tau_l|^r = \infty$ for any estimate $\eta = \eta(Y_0^\infty)$ of $\tau_l$ whenever $r \ge 1/2$, $s = 0$, $\varepsilon > 0$, and $l > 0$.)

3. Proofs of Results 

In this section we prove Theorems 2.1 and 2.2 and Proposition 2.1. Theorems 2.4 
and 2.5 are proved in the same way as Theorems 2.1 and 2.2, by merely ignor- 
ing the boundary crossing overshoot. The proofs of Theorems 2.4 and 2.5 are 
therefore omitted. 

Throughout the paper, V and W denote standard Gaussian random variables. 

3.1. Useful results 

The following result, given in [6, Theorem 2, equation (7)], provides an upper bound on overshoot that is uniform in the crossing level $l$.

Theorem 3.1 ([6]). Let $Z_1, Z_2, \ldots$ be i.i.d. random variables such that $\mathbb{E} Z_1 > 0$. Define $S_t = Z_1 + Z_2 + \ldots + Z_t$, $\mu_l = \inf\{t \ge 1 : S_t \ge l\}$, and $R_{\mu_l} = S_{\mu_l} - l$. Then,

$$\sup_{l \ge 0} \mathbb{E}\bigl(R_{\mu_l}^p\bigr) \le \frac{p + 2}{p + 1} \cdot \frac{\mathbb{E}|Z_1|^{p+2}}{\mathbb{E}(Z_1^2)} \qquad \text{for all } p > 0.$$



Burnashev and Tchamkerten / Tracking Stopping Times

Overshoot has been extensively studied and various other bounds have been exhibited (see, e.g., [2, 4]). However, to the best of our knowledge, the bound given by Theorem 3.1 is a tightest known bound in the sense that it has not been improved for all $s > 0$ and $p > 0$. In particular, it is tighter than Lorden's bound [4] for small values of $s$.

While our non-asymptotic results (Theorems 2.1 and 2.2) can easily be improved with tighter overshoot estimates, our main asymptotic result, Theorem 2.3, cannot.

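Corollary 3.1. Let $Z_1, Z_2, \ldots$ be i.i.d. random variables according to a mean $s > 0$ and variance $\sigma^2 > 0$ Gaussian distribution, and let $S_t$, $\mu_l$, and $R_{\mu_l}$ be defined as in Theorem 3.1. Then,

$$\sup_{l \ge 0} \mathbb{E}(R_{\mu_l}) \le 2s + 4\sigma, \tag{3.1}$$

and

$$l \le s\,\mathbb{E}\mu_l \le l + 2s + 4\sigma. \tag{3.2}$$

Proof of Corollary 3.1. Since

$$\mathbb{E}(Z_1^2) = s^2 + \sigma^2 \quad \text{and} \quad \mathbb{E} Z_1^4 = \mathbb{E}(s + \sigma V)^4 = s^4 + 6 s^2 \sigma^2 + 3\sigma^4,$$

we have

$$\sup_{l \ge 0} \mathbb{E}\bigl(R_{\mu_l}^2\bigr) \le \frac{4}{3}\left(s^2 + 5\sigma^2 - \frac{2\sigma^4}{s^2 + \sigma^2}\right)$$

from Theorem 3.1 with $p = 2$. Therefore,

$$\sup_{l \ge 0} \mathbb{E}(R_{\mu_l}) \le \sqrt{\sup_{l \ge 0} \mathbb{E}\bigl(R_{\mu_l}^2\bigr)} \le \sqrt{\frac{4}{3}\left(s^2 + 5\sigma^2 - \frac{2\sigma^4}{s^2 + \sigma^2}\right)} \le 2s + 4\sigma,$$

which gives (3.1).

Since

$$l \le \mathbb{E} S_{\mu_l} \le l + \sup_{l \ge 0} \mathbb{E}(R_{\mu_l})$$

and $\mathbb{E} S_{\mu_l} = s\,\mathbb{E}\mu_l$ by Wald's equation, inequality (3.2) follows from (3.1).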
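The chain of inequalities in this proof is easy to check numerically. The sketch below assumes that Theorem 3.1 with $p = 2$ reads $\sup_l \mathbb{E}(R^2) \le \tfrac{4}{3}\,\mathbb{E}Z_1^4/\mathbb{E}Z_1^2$ and uses the Gaussian moments $\mathbb{E}Z_1^2 = s^2 + \sigma^2$ and $\mathbb{E}Z_1^4 = s^4 + 6s^2\sigma^2 + 3\sigma^4$; the function names are hypothetical.

```python
import math


def second_moment_overshoot_bound(s, sigma):
    """(4/3) * E[Z^4] / E[Z^2] for Z ~ N(s, sigma^2), i.e. the p = 2 case
    of the overshoot bound under the reading stated in the lead-in."""
    second = s ** 2 + sigma ** 2
    fourth = s ** 4 + 6.0 * s ** 2 * sigma ** 2 + 3.0 * sigma ** 4
    return (4.0 / 3.0) * fourth / second


def simplified_form(s, sigma):
    """Closed form (4/3) * (s^2 + 5 sigma^2 - 2 sigma^4 / (s^2 + sigma^2))."""
    return (4.0 / 3.0) * (s ** 2 + 5.0 * sigma ** 2 - 2.0 * sigma ** 4 / (s ** 2 + sigma ** 2))


def corollary_bound(s, sigma):
    """First-moment overshoot bound 2s + 4*sigma of Corollary 3.1."""
    return 2.0 * s + 4.0 * sigma
```

For any $s, \sigma > 0$ the square root of `second_moment_overshoot_bound(s, sigma)` stays below `corollary_bound(s, sigma)`, which is the Jensen step used in the proof.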
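□

Lemma 3.1. The following inequalities hold for all $l \ge 0$ and $s > 0$:

$$\mathbb{E}(l - s\tau_l)^+ \le \mathbb{E}(s\tau_l - l)^+ \le \sqrt{\frac{l}{2\pi s}} + s + 2, \tag{3.3}$$

$$\mathbb{E}|s\tau_l - l| \le \sqrt{\frac{2l}{\pi s}} + 2s + 4, \tag{3.4}$$

$$\mathbb{E}(X_{\tau_l} - s\tau_l)^+ \le \sqrt{\frac{l}{2\pi s}} + 3s + 6. \tag{3.5}$$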

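Proof of Lemma 3.1. Throughout the proof we use $\lfloor x \rfloor$ to denote the largest integer not greater than $x$.

By definition, $X_{\tau_l} \ge l$, hence $l \le \mathbb{E} X_{\tau_l} = s\,\mathbb{E}\tau_l$ from Wald's equation. Using the identity $x = x^+ - (-x)^+$, we therefore get

$$0 \le \mathbb{E}(\tau_l - l/s) = \mathbb{E}(\tau_l - l/s)^+ - \mathbb{E}(l/s - \tau_l)^+,$$

i.e.,

$$\mathbb{E}(l - s\tau_l)^+ \le \mathbb{E}(s\tau_l - l)^+. \tag{3.6}$$

We upper bound the right-side of (3.6) as

$$\begin{aligned} \mathbb{E}(\tau_l - l/s)^+ &= \mathbb{E}(\tau_l - l/s;\ \tau_l > l/s) \\ &\le \mathbb{E}(\tau_l - \lfloor l/s \rfloor;\ X_{\lfloor l/s \rfloor} < l) \\ &= \mathbb{E}(\tau_{-G};\ G < 0) \end{aligned} \tag{3.7}$$

where $G$ is defined as

$$G = X_{\lfloor l/s \rfloor} - l.$$

Since $G \le \sum_{i=1}^{\lfloor l/s \rfloor} V_i \stackrel{d}{=} \sqrt{\lfloor l/s \rfloor}\, V$,¹ using Corollary 3.1 with $\sigma^2 = 1$ yields

$$\begin{aligned} \mathbb{E}(\tau_{-G};\ G < 0) &\le \mathbb{E}\left[\frac{-G}{s} + 2 + \frac{4}{s};\ G < 0\right] \\ &\le \frac{1}{s}\sqrt{\frac{l}{s}}\;\mathbb{E}(V)^+ + 1 + \frac{2}{s} \\ &= \sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}. \end{aligned} \tag{3.8}$$

From (3.6), (3.7), and (3.8) we get

$$\mathbb{E}(l - s\tau_l)^+ \le \mathbb{E}(s\tau_l - l)^+ \le \sqrt{\frac{l}{2\pi s}} + s + 2, \tag{3.9}$$

which gives (3.3).

Inequality (3.4) is an immediate consequence of (3.3).

Since $X_{\tau_l} \ge l$, we have

$$\mathbb{E}(X_{\tau_l} - s\tau_l)^+ \le \mathbb{E}(X_{\tau_l} - l) + \mathbb{E}(l - s\tau_l)^+.$$

This, together with (3.9) and the inequality

$$\mathbb{E}(X_{\tau_l} - l) \le 2s + 4 \tag{3.10}$$

obtained from Corollary 3.1, proves (3.5). □

¹ We use $\stackrel{d}{=}$ to denote equality in distribution.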
Proof of Theorem 2.1. We prove Theorem 2.1 by considering estimates of the form

$$\eta^{(c)} = \inf\{t \ge 1 : \hat X_t^{(c)} \ge l\}$$

where $\hat X^{(c)}$ is defined as

$$\hat X_0^{(c)} = 0, \qquad \hat X_t^{(c)} = st + c(Y_t - st) = st + c\left(\sum_{i=1}^{t} V_i + \varepsilon \sum_{i=1}^{t} W_i\right), \quad t \ge 1,$$

for some constant $c \ge 0$. To obtain the right-side of (2.2), we first upper bound $\mathbb{E}|\eta^{(c)} - \tau_l|$, $c \ge 0$, then optimize the bound over $c$.

Note that, for $c = 0$, we have $\eta^{(0)} = l/s$, and (3.4) gives

$$\mathbb{E}\bigl|\eta^{(0)} - \tau_l\bigr| \le \sqrt{\frac{2l}{\pi s^3}} + 2 + \frac{4}{s}. \tag{3.11}$$

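We now bound $\mathbb{E}|\eta^{(c)} - \tau_l|$ for arbitrary values of $c \ge 0$. Since

$$|x| = 2x^+ - x,$$

we have

$$\mathbb{E}\bigl|\eta^{(c)} - \tau_l\bigr| = 2\,\mathbb{E}\bigl(\eta^{(c)} - \tau_l\bigr)^+ - \mathbb{E}\bigl(\eta^{(c)} - \tau_l\bigr). \tag{3.12}$$

Applying Corollary 3.1 to $\tau_l$ and $\eta^{(c)}$ yields

$$\mathbb{E}\bigl(\eta^{(c)} - \tau_l\bigr) \ge -\frac{2s + 4}{s},$$

hence from (3.12)

$$\mathbb{E}\bigl|\eta^{(c)} - \tau_l\bigr| \le 2\,\mathbb{E}\bigl(\eta^{(c)} - \tau_l\bigr)^+ + \frac{2s + 4}{s}. \tag{3.13}$$

Below, we upper bound $\mathbb{E}(\eta^{(c)} - \tau_l)^+$ then use (3.13) to deduce a bound on $\mathbb{E}|\eta^{(c)} - \tau_l|$.

For notational convenience, throughout the calculations we often omit the superscript $(c)$ and simply write $\hat X_t$ and $\eta$ in place of $\hat X_t^{(c)}$ and $\eta^{(c)}$. Similarly, we often drop the subscript $l$ and write $\tau$ instead of $\tau_l$.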

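Let us introduce the auxiliary stopping time

$$\nu = \inf\{t \ge \tau : \hat X_t \ge l\}.$$

Note that $\nu$ is defined with respect to both processes $X$ and $Y$ and that $\nu \ge \max\{\eta, \tau\}$. It follows that

$$\begin{aligned} \mathbb{E}(\eta - \tau)^+ &\le \mathbb{E}(\nu - \tau;\ \eta > \tau) \\ &\le \mathbb{E}(\nu - \tau;\ \hat X_\tau < l) \\ &= \frac{1}{s}\,\mathbb{E}(\hat X_\nu - \hat X_\tau;\ \hat X_\tau < l) \end{aligned} \tag{3.14}$$

where the second inequality holds since $\{\eta > \tau\} \subset \{\hat X_\tau < l\}$ and where for the last equality we used Wald's equation.

Since the random walk $\hat X$ has incremental steps with mean $s$ and variance $c^2(1 + \varepsilon^2)$, from Corollary 3.1 we get

$$\begin{aligned} \mathbb{E}[\hat X_\nu - \hat X_\tau;\ \hat X_\tau < l] &\le \mathbb{E}\bigl[l + 2s + 4c\sqrt{1 + \varepsilon^2} - \hat X_\tau;\ \hat X_\tau < l\bigr] \\ &\le \mathbb{E}\bigl[X_\tau + 2s + 4c\sqrt{1 + \varepsilon^2} - \hat X_\tau;\ \hat X_\tau < X_\tau\bigr] \\ &\le s + 2c\sqrt{1 + \varepsilon^2} + \mathbb{E}(X_\tau - \hat X_\tau)^+ \end{aligned}$$

hence from (3.14)

$$\mathbb{E}\bigl(\eta^{(c)} - \tau\bigr)^+ \le \frac{1}{s}\,\mathbb{E}\bigl(X_\tau - \hat X_\tau^{(c)}\bigr)^+ + 1 + \frac{2c\sqrt{1 + \varepsilon^2}}{s}. \tag{3.15}$$

Before we compute a bound on $\mathbb{E}(X_\tau - \hat X_\tau^{(c)})^+$ for general values of $c \ge 0$, we consider the simpler case $c = 1$.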

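Case $c = 1$: Here $\hat X_t^{(1)} = Y_t$ and $\eta^{(1)} = \inf\{t \ge 0 : Y_t \ge l\}$. Moreover, we have $Y_\tau \stackrel{d}{=} X_\tau + \varepsilon\sqrt{\tau}\,W$ with $W$ independent of $X_\tau$. It follows that

$$\begin{aligned} \mathbb{E}(X_\tau - \hat X_\tau)^+ &= \mathbb{E}(\varepsilon\sqrt{\tau}\,W)^+ \\ &= \varepsilon\,\mathbb{E}(\sqrt{\tau})\,\mathbb{E}(W)^+ \\ &= \varepsilon\,\frac{\mathbb{E}(\sqrt{\tau})}{\sqrt{2\pi}} \\ &\le \varepsilon\sqrt{\frac{\mathbb{E}(\tau)}{2\pi}} \\ &\le \varepsilon\sqrt{\frac{l + 2s + 4}{2\pi s}} \end{aligned} \tag{3.16}$$

where for the first inequality we used Jensen's inequality, and where the second inequality follows from Corollary 3.1.

Combining (3.16) with (3.15) ($c = 1$) yields

$$\mathbb{E}\bigl(\eta^{(1)} - \tau_l\bigr)^+ \le \frac{\varepsilon\sqrt{l + 2s + 4}}{\sqrt{2\pi s^3}} + \frac{s + 2\sqrt{1 + \varepsilon^2}}{s}$$

which, together with (3.13), gives

$$\mathbb{E}\bigl|\eta^{(1)} - \tau_l\bigr| \le \frac{2\varepsilon\sqrt{l + 2s + 4}}{\sqrt{2\pi s^3}} + \frac{4\bigl(s + 1 + \sqrt{1 + \varepsilon^2}\bigr)}{s}. \tag{3.17}$$

Comparing (3.17) with (3.11) we note that for fixed $s > 0$, if $\varepsilon \ll 1$, then $\mathbb{E}|\eta^{(1)} - \tau_l| \ll \mathbb{E}|\eta^{(0)} - \tau_l|$ for large values of $l$.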

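General case $c \ge 0$: We compute a general upper bound on $\mathbb{E}(X_{\tau_l} - \hat X_{\tau_l}^{(c)})^+$, $c \ge 0$, and use (3.13) and (3.15) to obtain a bound on $\mathbb{E}|\eta^{(c)} - \tau_l|$.

Let $U_i$ be the incremental step of the random walk $X_t - \hat X_t^{(c)}$, i.e.,

$$U_i = (1 - c)V_i - c\varepsilon W_i.$$

Given the fixed time horizon $n = \lfloor l/s \rfloor$, we have

$$X_{\tau} - \hat X_{\tau}^{(c)} = \sum_{i=1}^{n} U_i - \mathbf{1}\{\tau < n\}\sum_{i=\tau+1}^{n} U_i + \mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} U_i, \tag{3.18}$$

and therefore

$$\mathbb{E}\bigl(X_\tau - \hat X_\tau^{(c)}\bigr)^+ \le \mathbb{E}\Bigl(\sum_{i=1}^{n} U_i\Bigr)^+ + \mathbb{E}\Bigl(-\mathbf{1}\{\tau < n\}\sum_{i=\tau+1}^{n} U_i\Bigr)^+ + \mathbb{E}\Bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} U_i\Bigr)^+. \tag{3.19}$$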



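We bound each term on the right-side of (3.19). For the first term, since $\sum_{i=1}^{n} U_i \stackrel{d}{=} \sqrt{n[(1 - c)^2 + c^2\varepsilon^2]}\,V$, we have

$$\begin{aligned} \mathbb{E}\Bigl(\sum_{i=1}^{n} U_i\Bigr)^+ &= \sqrt{n[(1 - c)^2 + c^2\varepsilon^2]}\;\mathbb{E}(V)^+ \\ &= \sqrt{\frac{n[(1 - c)^2 + c^2\varepsilon^2]}{2\pi}} \\ &\le \sqrt{\frac{l[(1 - c)^2 + c^2\varepsilon^2]}{2\pi s}}. \end{aligned} \tag{3.20}$$

For the second term on the right-side of (3.19), since $\tau$ is independent of $U_{\tau+1}, U_{\tau+2}, \ldots$, we have

$$\begin{aligned} \mathbb{E}\Bigl(-\mathbf{1}\{\tau < n\}\sum_{i=\tau+1}^{n} U_i\Bigr)^+ &= \mathbb{E}\Bigl[\sqrt{(n - \tau)^+\,[(1 - c)^2 + c^2\varepsilon^2]}\;V\Bigr]^+ \\ &= \sqrt{\frac{(1 - c)^2 + c^2\varepsilon^2}{2\pi}}\;\mathbb{E}\sqrt{(n - \tau)^+} \\ &\le \sqrt{\frac{[(1 - c)^2 + c^2\varepsilon^2]\,\mathbb{E}(n - \tau)^+}{2\pi}} \\ &\le \sqrt{\frac{(1 - c)^2 + c^2\varepsilon^2}{2\pi}\left(\sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}\right)} \end{aligned} \tag{3.21}$$

where the first inequality holds by Jensen's inequality and where the last inequality follows from (3.3).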
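For the third term on the right-side of (3.19), we have

$$\mathbb{E}\Bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} U_i\Bigr)^+ \le c\varepsilon\,\mathbb{E}\Bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} W_i\Bigr)^+ + (1 - c)^+\,\mathbb{E}\Bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} V_i\Bigr)^+. \tag{3.22}$$

Since $\tau$ and $\{W_i\}$ are independent, we have

$$\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} W_i \stackrel{d}{=} \sqrt{(\tau - n)^+}\;W$$

and a similar calculation as for (3.21) shows that

$$\mathbb{E}\Bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} W_i\Bigr)^+ \le \sqrt{\frac{1}{2\pi}\left(\sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}\right)}. \tag{3.23}$$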



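We now focus on the second expectation on the right-side of (3.22). Note first that, on $\{\tau > n\}$, we have

$$\sum_{i=n+1}^{\tau} V_i = (X_\tau - X_n) - s(\tau - n).$$

Therefore, to bound $\mathbb{E}\bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} V_i\bigr)^+$, we consider the 'shifted' sequence $\{S_t = X_t - X_n\}_{t \ge n}$, and its crossing of level $l - X_n$. Using (3.5) (with $l - X_n$ instead of $l$) we have

$$\begin{aligned} \mathbb{E}\Bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} V_i\Bigr)^+ &\le \mathbb{E}\bigl([X_\tau - X_n - s(\tau - n)]^+;\ X_n < l\bigr) \\ &\le \mathbb{E}\sqrt{\frac{(l - X_n)^+}{2\pi s}} + 3s + 6 \\ &\le \sqrt{\frac{\mathbb{E}(l - X_n)^+}{2\pi s}} + 3s + 6 \\ &\le \frac{l^{1/4}}{(2\pi s)^{3/4}} + 3s + 6 \end{aligned} \tag{3.24}$$

where the third inequality follows from Jensen's inequality. Combining (3.22) together with (3.23) and (3.24) yields

$$\mathbb{E}\Bigl(\mathbf{1}\{\tau > n\}\sum_{i=n+1}^{\tau} U_i\Bigr)^+ \le c\varepsilon\sqrt{\frac{1}{2\pi}\left(\sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}\right)} + (1 - c)^+\left(\frac{l^{1/4}}{(2\pi s)^{3/4}} + 3s + 6\right), \tag{3.25}$$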
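and from (3.15), (3.19), (3.20), (3.21), and (3.25) we get

$$\begin{aligned} \mathbb{E}\bigl(\eta^{(c)} - \tau\bigr)^+ \le{}& \sqrt{\frac{l[(1 - c)^2 + c^2\varepsilon^2]}{2\pi s^3}} + \sqrt{\frac{(1 - c)^2 + c^2\varepsilon^2}{2\pi s^2}\left(\sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}\right)} \\ &+ c\varepsilon\sqrt{\frac{1}{2\pi s^2}\left(\sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}\right)} + \frac{(1 - c)^+}{s}\left(\frac{l^{1/4}}{(2\pi s)^{3/4}} + 3s + 6\right) \\ &+ 1 + \frac{2c\sqrt{1 + \varepsilon^2}}{s}. \end{aligned} \tag{3.26}$$

To minimize the first term on the right-side of (3.26), we set $c = \hat c = 1/(1 + \varepsilon^2)$ so as to minimize the factor $(1 - c)^2 + c^2\varepsilon^2$. With $c = \hat c$ we have $(1 - c)^2 + c^2\varepsilon^2 = \varepsilon^2/(1 + \varepsilon^2)$ and get

$$\begin{aligned} \mathbb{E}\bigl(\eta^{(\hat c)} - \tau\bigr)^+ \le{}& \sqrt{\frac{l\varepsilon^2}{2\pi(1 + \varepsilon^2)s^3}} + \frac{\varepsilon}{1 + \varepsilon^2}\sqrt{\frac{1}{2\pi s^2}\left(\sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}\right)} \\ &+ \sqrt{\frac{\varepsilon^2}{2\pi(1 + \varepsilon^2)s^2}\left(\sqrt{\frac{l}{2\pi s^3}} + 1 + \frac{2}{s}\right)} + \frac{\varepsilon^2}{s(1 + \varepsilon^2)}\left(\frac{l^{1/4}}{(2\pi s)^{3/4}} + 3s + 6\right) \\ &+ 1 + \frac{2}{s\sqrt{1 + \varepsilon^2}}. \end{aligned}$$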



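We further upper bound $\varepsilon/(1 + \varepsilon^2)$ and $\varepsilon^2/(1 + \varepsilon^2)$ by one and get the weaker yet simpler bound

$$\mathbb{E}\bigl(\eta^{(\hat c)} - \tau\bigr)^+ \le \sqrt{\frac{l\varepsilon^2}{2\pi(1 + \varepsilon^2)s^3}} + \frac{3}{s}\left(\frac{l}{(2\pi s)^3}\right)^{1/4} + \sqrt{\frac{2(s + 2)}{\pi s^3}} + 4 + \frac{8}{s}. \tag{3.27}$$

Finally, combining (3.27) with (3.13) yields

$$\mathbb{E}\bigl|\eta^{(\hat c)} - \tau\bigr| \le \sqrt{\frac{2l\varepsilon^2}{\pi(1 + \varepsilon^2)s^3}} + \frac{6}{s}\left(\frac{l}{(2\pi s)^3}\right)^{1/4} + \sqrt{\frac{8(s + 2)}{\pi s^3}} + 10 + \frac{20}{s}$$

from which Theorem 2.1 follows. □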



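Proof of Theorem 2.2. We prove Theorem 2.2 by establishing a lower bound on $\mathbb{E}|\eta - \tau_l|$ for any estimator $\eta = \eta(Y_0^\infty)$ that has access to the entire observation sequence $Y_0^\infty$.

Fix an arbitrary integer $n$ such that $1 < n < l/s$ (by assumption $l/s > 2$), and let us break the minimization problem into two parts as

$$\begin{aligned} \inf_{\eta(Y_0^\infty)} \mathbb{E}|\eta - \tau_l| &\ge \inf_{\eta(Y_0^\infty)} \mathbb{E}\bigl[|\eta - \tau_l|;\ Y_n < l\bigr] \\ &\ge \inf_{\eta(Y_0^\infty)} \mathbb{E}\left[\Bigl|\eta - n - \frac{l - X_n}{s}\Bigr|;\ Y_n < l\right] - \mathbb{E}\left[\Bigl|\tau_l - n - \frac{l - X_n}{s}\Bigr|;\ Y_n < l\right] \\ &= \frac{1}{s}\inf_{\eta(Y_0^\infty)} \mathbb{E}\bigl[|\eta - X_n|;\ Y_n < l\bigr] - \mathbb{E}\left[\Bigl|\tau_l - n - \frac{l - X_n}{s}\Bigr|;\ Y_n < l\right]. \end{aligned} \tag{3.28}$$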



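We first upper bound the second expectation on the right-side of (3.28). Using (3.4), we have for $X_n < l$

$$\mathbb{E}\left[\Bigl|\tau_l - n - \frac{l - X_n}{s}\Bigr| \,\Big|\, X_n, Y_n < l\right] \le \sqrt{\frac{2(l - X_n)}{\pi s^3}} + 2 + \frac{4}{s}. \tag{3.29}$$

Since $X_n \stackrel{d}{=} sn + \sqrt{n}\,V$ and since $l - sn > 0$ by assumption, we have

$$\begin{aligned} \mathbb{E}(l - X_n)^+ &= \mathbb{E}(l - sn - \sqrt{n}\,V)^+ \\ &\le l - sn + \sqrt{n}\;\mathbb{E}V^+ \\ &= l - sn + \sqrt{\frac{n}{2\pi}}. \end{aligned}$$

Hence, from Jensen's inequality

$$\mathbb{E}\sqrt{(l - X_n)^+} \le \sqrt{\mathbb{E}(l - X_n)^+} \le \left(l - sn + \sqrt{\frac{n}{2\pi}}\right)^{1/2}$$

and therefore from (3.29)

$$\mathbb{E}\left[\Bigl|\tau_l - n - \frac{l - X_n}{s}\Bigr|;\ Y_n < l\right] \le \sqrt{\frac{2}{\pi s^3}}\left(l - sn + \sqrt{\frac{n}{2\pi}}\right)^{1/2} + 2 + \frac{4}{s}. \tag{3.30}$$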




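To lower bound the first expectation on the right-side of (3.28), we proceed as follows. Since $X_n$ and $Y_n$ are jointly Gaussian, we may represent $X_n$ as

$$X_n = \sqrt{n\varepsilon^2/(1 + \varepsilon^2)}\;V + c \cdot Y_n + d,$$

where $V$ is a standard Gaussian random variable independent of $Y_n$, and where $c$ and $d$ are (nonnegative) constants (that depend on $s$ and $\varepsilon$). Using this alternative representation of $X_n$ yields

$$\begin{aligned} \inf_{\eta(Y_0^\infty)} \mathbb{E}\bigl[|\eta - X_n|;\ Y_n < l\bigr] &= \inf_{\eta(Y_0^\infty)} \mathbb{E}\Bigl[\bigl|\eta - \sqrt{n\varepsilon^2/(1 + \varepsilon^2)}\;V - c \cdot Y_n - d\bigr|;\ Y_n < l\Bigr] \\ &= \sqrt{\frac{n\varepsilon^2}{1 + \varepsilon^2}}\;\inf_{\eta(Y_0^\infty)} \mathbb{E}\bigl[|\eta - V|;\ Y_n < l\bigr] \\ &= \sqrt{\frac{n\varepsilon^2}{1 + \varepsilon^2}}\;\Bigl(\inf_{e} \mathbb{E}|e - V|\Bigr)\,\mathbb{P}(Y_n < l) \\ &= \sqrt{\frac{n\varepsilon^2}{1 + \varepsilon^2}}\;\bigl(\mathbb{E}|V|\bigr)\,\mathbb{P}(Y_n < l) \\ &= \sqrt{\frac{2n\varepsilon^2}{\pi(1 + \varepsilon^2)}}\left[1 - Q\!\left(\frac{l - sn}{\sqrt{n(1 + \varepsilon^2)}}\right)\right] \end{aligned} \tag{3.31}$$

where the infimum on the right-side of the third equality is over constant estimators $e$ (i.e., independent of $Y_0^\infty$), and where for the fourth equality we used the fact that the median of a random variable is its best estimator with respect to the average absolute deviation.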
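Two facts used in this step can be illustrated with a short computation. First, for jointly Gaussian $(X_n, Y_n)$ with $\mathrm{Var}(X_n) = n$, $\mathrm{Var}(Y_n) = n(1+\varepsilon^2)$, and $\mathrm{Cov}(X_n, Y_n) = n$, the standard Gaussian regression formulas give the coefficients of the representation of $X_n$. Second, the mean absolute deviation of a sample is minimized at its median. The sketch below is an illustration with hypothetical function names; the explicit values of $c$ and $d$ come from these regression formulas and are not stated in the text.

```python
import math
import statistics


def conditional_representation(n, s, eps):
    """Coefficients (a, c, d) such that X_n = a*V + c*Y_n + d with V standard
    Gaussian and independent of Y_n (Gaussian regression of X_n on Y_n)."""
    var_x = n
    var_y = n * (1.0 + eps ** 2)
    cov_xy = n
    c = cov_xy / var_y                    # regression slope
    d = s * n - c * s * n                 # E[X_n] - c * E[Y_n]
    a = math.sqrt(var_x - cov_xy ** 2 / var_y)   # residual standard deviation
    return a, c, d


def mean_abs_dev(sample, point):
    """Average absolute deviation of the sample from a constant estimate."""
    return sum(abs(v - point) for v in sample) / len(sample)
```

The residual standard deviation returned as `a` equals $\sqrt{n\varepsilon^2/(1+\varepsilon^2)}$, matching the factor in front of $V$, and `mean_abs_dev(sample, statistics.median(sample))` is never larger than the deviation from any other constant.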

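Combining (3.28), (3.30), and (3.31) we obtain

$$\inf_{\eta(Y_0^\infty)} \mathbb{E}|\eta - \tau_l| \ge \frac{1}{s}\sqrt{\frac{2n\varepsilon^2}{\pi(1 + \varepsilon^2)}}\left[1 - Q\!\left(\frac{l - sn}{\sqrt{n(1 + \varepsilon^2)}}\right)\right] - \sqrt{\frac{2}{\pi s^3}}\left(l - sn + \sqrt{\frac{n}{2\pi}}\right)^{1/2} - 2 - \frac{4}{s},$$

yielding the desired result. □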



Proof of Proposition 2.1. We prove the result only for the Gaussian random walk setting. The proof for the Brownian motion setting follows the same arguments and is therefore omitted.

Throughout the proof we fix some $\varepsilon > 0$, $l > 0$, and let $s = 0$.

To prove claim i., we show that, for any $h > 0$, $\inf_{\eta(Y_0^\infty)} \mathbb{E} f(|\eta - \tau_l|)$ is lower bounded by $\mathbb{E} f(\tau_h/2)$ multiplied by some strictly positive constant.

The first step consists in removing the 'noise' in the observation process $Y$ from time $t = 2$ onwards, i.e., instead of $\{Y_t\}_{t \ge 0}$, we consider the better observation process $\{Z_t\}_{t \ge 0}$ defined as

$$\begin{aligned} Z_0 &= 0, \\ Z_1 &= X_1 + \varepsilon W_1 = V_1 + \varepsilon W_1, \\ Z_t &= X_t - X_{t-1} = V_t, \quad t \ge 2. \end{aligned}$$

Clearly, it is easier to estimate $\tau_l$ based on $Z_0^\infty$ than based on $Y_0^\infty$; one gets $Y_t - Y_{t-1}$ by artificially adding the 'noise' $\varepsilon W_t$ to $Z_t$, $t > 1$. Therefore,

$$\inf_{\eta(Y_0^\infty)} \mathbb{E} f(|\eta - \tau_l|) \ge \inf_{\eta(Z_0^\infty)} \mathbb{E} f(|\eta - \tau_l|). \tag{3.32}$$
Given $Z_0^\infty$, estimation errors on $\tau_l$ are only due to the unknown value of $X_1$ because of the unknown value of the noise $\varepsilon W_1$. In turn, given $Z_0^\infty$, it is sufficient to consider only $Z_1$ in order to estimate $X_1$ ($Z_1$ is a sufficient statistic for $X_1$).

Below, we are going to make use of the important property that the conditional density function of $X_1 (= V_1)$ given $Z_1$ is non-degenerate since it is given by

$$p(x \mid z) = \sqrt{\frac{1 + \varepsilon^2}{2\pi\varepsilon^2}}\;\exp\left(-\frac{1 + \varepsilon^2}{2\varepsilon^2}\Bigl(x - \frac{z}{1 + \varepsilon^2}\Bigr)^2\right)$$

and since $\varepsilon > 0$ by assumption.

Define $C = C(Z_1) = Z_1/(1 + \varepsilon^2) - h/2$ and $D = D(Z_1) = Z_1/(1 + \varepsilon^2) + h/2$ where $h > 0$ is some arbitrary constant. From the above non-degeneracy property it follows that

$$\mathbb{P}(X_1 < C) = \mathbb{P}(X_1 > D) = \delta_1 = \delta_1(h, \varepsilon) > 0.$$

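Using this, we lower bound

$$\inf_{\eta(Z_0^\infty)} \mathbb{E} f(|\eta - \tau_l|)$$

by considering the following three-hypothesis problem: with probability $1 - 2\delta_1$, $X_1$ is known exactly (hence $\tau_l$ is known exactly as well), and with equal probability $\delta_1$, $X_1$ is either equal to $C$ or equal to $D$ (and no additional information on $X_1$ is available). More specifically, denoting by $\tau_l^C$ the value of $\tau_l$ when $X_1 = C$, and by $\tau_l^D$ the value of $\tau_l$ when $X_1 = D$, we have

$$\begin{aligned} \inf_{\eta(Z_0^\infty)} \mathbb{E} f(|\eta - \tau_l|) &\ge \inf_{\eta(Z_0^\infty)} \bigl\{\mathbb{E}[f(|\eta - \tau_l|);\ X_1 < C] + \mathbb{E}[f(|\eta - \tau_l|);\ X_1 > D]\bigr\} \\ &\ge \inf_{\eta(Z_0^\infty)} \bigl\{\mathbb{E}[f(|\eta - \tau_l^C|);\ X_1 < C] + \mathbb{E}[f(|\eta - \tau_l^D|);\ X_1 > D]\bigr\} \\ &= \delta_1 \inf_{\eta(Z_0^\infty)} \mathbb{E}\bigl[f(|\eta - \tau_l^C|) + f(|\eta - \tau_l^D|)\bigr] \\ &\ge \delta_1\,\mathbb{E} f\!\left(\frac{\tau_l^C - \tau_l^D}{2}\right), \end{aligned} \tag{3.33}$$

where the second and third inequalities follow from the assumption that $f(x)$ is non-negative and non-decreasing. Further, since $\tau_l^C = \tau_{(l - C)^+}$ and since $\tau_{l_1} - \tau_{l_2} \stackrel{d}{=} \tau_{l_1 - l_2}$, $l_1 \ge l_2$, from (3.33) we get

$$\begin{aligned} \inf_{\eta(Z_0^\infty)} \mathbb{E} f(|\eta - \tau_l|) &\ge \delta_1\,\mathbb{E} f\!\left(\frac{\tau_{(l - C)^+} - \tau_{(l - D)^+}}{2}\right) \\ &= \delta_1\,\mathbb{E} f\!\left(\frac{\tau_{(l - C)^+ - (l - D)^+}}{2}\right). \end{aligned} \tag{3.34}$$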
Now, on $\{D \le l\}$ we have

$$(l - C)^+ - (l - D)^+ = D - C = h,$$

therefore from (3.34) we get

$$\inf_{\eta(Z_0^\infty)} \mathbb{E} f(|\eta - \tau_l|) \ge \delta_1 \delta_2\,\mathbb{E} f\!\left(\frac{\tau_h}{2}\right), \tag{3.35}$$

where

$$\delta_2 = \delta_2(h, l, \varepsilon) = \mathbb{P}(D \le l) > 0.$$

Claim i. follows from (3.35) and (3.32).

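We now prove claim ii. Let $\{B_t\}_{t \ge 0}$ be a standard Brownian motion with $B_0 = 0$. For $l \ge 0$ introduce the crossing time

$$\tau_l^{(B)} = \inf\{t \ge 0 : B_t = l\}.$$

Since $\tau_l^{(B)} \le \tau_l$ for all $l \ge 0$, had we proved that $\mathbb{E} f(\tau_h^{(B)}/2) = \infty$, $h > 0$, equation (2.5) would be satisfied since $f(x)$ is non-decreasing.

Now, using the reflection principle we get

$$\mathbb{P}\bigl(\tau_h^{(B)} \le t\bigr) = 2\,\mathbb{P}(B_t \ge h) = 2Q\!\left(\frac{h}{\sqrt{t}}\right), \qquad h \ge 0,\ t > 0,$$

hence, for any constant $c > 0$,

$$\begin{aligned} \mathbb{E} f\bigl(\tau_h^{(B)}/2\bigr) &= 2\int_0^\infty f(t/2)\,dQ\!\left(\frac{h}{\sqrt{t}}\right) \\ &= \frac{h}{\sqrt{2\pi}}\int_0^\infty \frac{f(t/2)}{t^{3/2}}\,e^{-h^2/(2t)}\,dt \\ &\ge \frac{h\,e^{-h^2/(2c)}}{\sqrt{2\pi}}\int_c^\infty \frac{f(t/2)}{t^{3/2}}\,dt. \end{aligned}$$

Therefore, if $f(x) = x^r$ with $r \ge 1/2$, then $\mathbb{E} f(\tau_h^{(B)}/2) = \infty$ for all $h > 0$.

Claim ii. follows. □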
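The divergence in claim ii. can be observed numerically. By the reflection principle, the hitting time of level $h$ by a standard Brownian motion has density $h\,(2\pi t^3)^{-1/2} e^{-h^2/(2t)}$, so the truncated moment $\int_0^T (t/2)^r \cdot \text{density}(t)\,dt$ grows like $T^{r - 1/2}$ for $r > 1/2$ and stays bounded for $r < 1/2$. The sketch below approximates this truncated integral with a trapezoid rule on a logarithmic grid; the function name, the grid, and the lower cutoff are assumptions of this illustration (the cutoff is harmless for $h$ of order one because the density is negligible near zero).

```python
import math


def truncated_moment(r, h, upper, steps=20000, lower=1e-3):
    """Approximate the integral over [lower, upper] of (t/2)^r * density(t) dt, with
    density(t) = h / sqrt(2*pi*t^3) * exp(-h^2 / (2t)), using the trapezoid rule in u = log t."""
    a, b = math.log(lower), math.log(upper)
    du = (b - a) / steps
    total = 0.0
    for k in range(steps + 1):
        t = math.exp(a + k * du)
        density = h / math.sqrt(2.0 * math.pi * t ** 3) * math.exp(-h * h / (2.0 * t))
        value = (t / 2.0) ** r * density * t      # extra factor t because dt = t du
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * value
    return total * du
```

Comparing `truncated_moment(0.75, 1.0, 1e4)` with `truncated_moment(0.75, 1.0, 1e8)` shows the truncated moment still growing by about an order of magnitude, whereas for `r = 0.25` the two values are nearly identical.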

4. Concluding Remarks 

We considered the TST problem with two correlated Gaussian random walks (or two correlated Brownian motions with drift) and a threshold crossing time $\tau_l$ to be tracked. Non-asymptotic upper and lower bounds on $\inf_\eta \mathbb{E}|\eta - \tau_l|$ have been derived that coincide in certain asymptotic regimes.

Some analysis suggests that ideas used to obtain the upper and lower bounds given by Theorems 2.1 and 2.2 could be extended to higher order loss functions of the form $\mathbb{E}|\eta - \tau_l|^r$, $r > 1$. However, while a more refined estimate analysis may result in a tight asymptotic characterization of $\inf_\eta \mathbb{E}|\eta - \tau_l|^r$, simple non-asymptotic bounds as given by Theorems 2.1 and 2.2 may be more difficult to obtain.

Finally, extensions of our results to non-Gaussian random walk settings may be envisioned. Here a main difficulty appears to be the derivation of a good lower bound. In fact, a main step in the proof of Theorem 2.2 (see the argument after equation (3.30)) takes advantage of the fact that $X_n$ and $Y_n$ are jointly Gaussian.

References 

[1] Anscombe, F. J., Godwin, H. J. and Plackett, R. L. (1947). Methods of deferred sentencing in testing the fraction defective of a continuous output. Supplement to the Journal of the Royal Statistical Society, 9 198-217.

[2] Chang, J. (1994). Inequalities for the overshoot. Ann. Appl. Prob., 4 1223-1233.

[3] Lai, T. (1998). Information bounds and quick detection of parameter changes in stochastic systems. IEEE Trans. Inform. Th., 44 2917-2929.

[4] Lorden, G. (1970). On excess over the boundary. Ann. Math. Stat., 41 350-357.

[5] Lorden, G. (1971). Procedures for reacting to a change in distribution. Ann. Math. Stat., 42 1897-1908.

[6] Mogulskii, A. (1973). Absolute estimates for moments of certain boundary functionals. Th. Prob. Appl., 18 350-357.

[7] Moustakides, G. (1986). Optimal procedures for detecting changes in distribution. Ann. Stat., 14 1379-1387.

[8] Niesen, U. and Tchamkerten, A. (2009). Tracking stopping times through noisy observations. IEEE Trans. Inform. Th., 55 422-432.

[9] Shiryaev, A. N. (1963). On optimum methods in quickest detection problems. Th. Prob. and its App., 8 22-46.

[10] Shiryayev, A. N. (1978). Optimal Stopping Rules. Springer-Verlag.

[11] Tartakovsky, A. G. and Veeravalli, V. V. (2005). General asymptotic Bayesian theory of quickest change detection. Th. Prob. Appl., 49 458-497.

[12] Yakir, B. (1994). Optimal detection of a change in distribution when the observations form a Markov chain with a finite state space. In Change-point problems, vol. 23. Institute of Mathematical Statistics, Lecture Notes, Monograph Series, 346-358.



